The use of image-text matching models has revolutionized zero-shot learning in computer vision. The most notable example, CLIP, has been widely used for zero-shot classification with text prompts and for guiding generative models. However, the zero-shot use of CLIP is unstable with respect to the phrasing of the input text, making it necessary to carefully engineer the prompts used. We find that this instability stems from a selective similarity score, which is based only on a subset of the semantically meaningful input tokens. To mitigate it, we present a novel explainability-based approach, which adds a loss term to ensure that CLIP focuses on all relevant semantic parts of the input, in addition to employing the CLIP similarity loss used in previous works. When applied to one-shot classification through prompt engineering, our method improves the recognition rate without any additional training or fine-tuning. Additionally, we show that CLIP guidance of generative models using our method significantly improves the generated images. Finally, we demonstrate a novel use of CLIP guidance for text-based image generation with spatial conditioning on object location, by requiring the image explainability heatmap to be confined to a pre-determined bounding box.
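A minimal sketch of the idea, assuming a per-token relevance vector is already available (in practice it would come from a Transformer-explainability method applied to CLIP); the function names, the exact form of the coverage term, and the weight lam are illustrative assumptions, not the paper's formulation:

import torch

def explainability_coverage_loss(token_relevance, is_semantic_token):
    # Hypothetical coverage term: push relevance up on every semantically
    # meaningful token, so the score is not dominated by a small subset.
    relevant = token_relevance[is_semantic_token]
    return (1.0 - relevant).clamp(min=0).mean()

def guidance_loss(clip_similarity, token_relevance, is_semantic_token, lam=0.5):
    # Combined objective: the usual CLIP similarity loss plus the coverage term.
    return -clip_similarity + lam * explainability_coverage_loss(token_relevance, is_semantic_token)

# Toy usage with made-up numbers.
sim = torch.tensor(0.31)                              # image-text similarity
relevance = torch.tensor([0.05, 0.9, 0.2, 0.7])       # per-token relevance
semantic = torch.tensor([False, True, True, True])    # mask of content tokens
print(guidance_loss(sim, relevance, semantic))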
Self-attention techniques, and specifically Transformers, are dominating the field of text processing and are becoming increasingly popular in computer vision classification tasks. In order to visualize the parts of the image that led to a certain classification, existing methods either rely on the obtained attention maps or employ heuristic propagation along the attention graph. In this work, we propose a novel way to compute relevancy for Transformer networks. The method assigns local relevance based on the Deep Taylor Decomposition principle and then propagates these relevancy scores through the layers. This propagation involves attention layers and skip connections, which challenge existing methods. Our solution is based on a specific formulation that is shown to maintain the total relevancy across layers. We benchmark our method on very recent visual Transformer networks, as well as on a text classification problem, and demonstrate a clear advantage over the existing explainability methods. Our code is available at: https://github.com/hila-chefer/Transformer-Explainability.
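The released code is the authoritative implementation; the snippet below is only a rough, hedged sketch of the flavor of such propagation, aggregating head-averaged gradient-weighted attention across layers and adding the identity for skip connections. It is a rollout-style approximation for illustration, not the paper's Deep-Taylor-based rule, and all tensor shapes are toy assumptions:

import torch

def aggregate_relevance(attn_maps, attn_grads):
    # Propagate token-to-token relevance from the output back to the input:
    # per layer, average (gradient * attention) over heads, keep the positive
    # part, add the identity for the residual path, and compose the matrices.
    num_tokens = attn_maps[0].shape[-1]
    relevance = torch.eye(num_tokens)
    for attn, grad in zip(attn_maps, attn_grads):
        cam = (grad * attn).clamp(min=0).mean(dim=0)   # (tokens, tokens)
        relevance = (torch.eye(num_tokens) + cam) @ relevance
    return relevance

# Toy example: 3 layers, 4 heads, 5 tokens (CLS + 4 patches); random tensors
# stand in for attention maps and their gradients w.r.t. the target class.
maps  = [torch.rand(4, 5, 5).softmax(dim=-1) for _ in range(3)]
grads = [torch.randn(4, 5, 5) for _ in range(3)]
print(aggregate_relevance(maps, grads)[0, 1:])         # CLS relevance per patch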
Large language models can perform new tasks in a zero-shot fashion, given natural language prompts that specify the desired behavior. Such prompts are typically hand engineered, but can also be learned with gradient-based methods from labeled data. However, it is underexplored what factors make the prompts effective, especially when the prompts are natural language. In this paper, we investigate common attributes shared by effective prompts. We first propose a human readable prompt tuning method (FluentPrompt) based on Langevin dynamics that incorporates a fluency constraint to find a diverse distribution of effective and fluent prompts. Our analysis reveals that effective prompts are topically related to the task domain and calibrate the prior probability of label words. Based on these findings, we also propose a method for generating prompts using only unlabeled data, outperforming strong baselines by an average of 7.0% accuracy across three tasks.
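A hedged sketch of one Langevin-style update on continuous prompt embeddings, with stand-in task and fluency losses; the step size, noise scale, and weighting are illustrative assumptions, and in the real setting the losses would come from a frozen language model (label likelihood for the task, prompt log-perplexity for fluency):

import torch

def langevin_prompt_step(prompt_embeds, task_loss_fn, fluency_loss_fn,
                         step_size=1e-3, noise_scale=1e-3, lam=0.5):
    # One update: gradient step on (task loss + fluency penalty) plus Gaussian
    # noise, so the search samples a distribution of effective, fluent prompts.
    prompt_embeds = prompt_embeds.detach().requires_grad_(True)
    loss = task_loss_fn(prompt_embeds) + lam * fluency_loss_fn(prompt_embeds)
    loss.backward()
    with torch.no_grad():
        noise = noise_scale * (2 * step_size) ** 0.5 * torch.randn_like(prompt_embeds)
        new_embeds = prompt_embeds - step_size * prompt_embeds.grad + noise
    return new_embeds, loss.item()

# Toy usage with stand-in losses over 5 prompt-token embeddings.
embeds = torch.randn(5, 768)
task = lambda e: (e.norm(dim=-1) - 1.0).pow(2).mean()
fluency = lambda e: e.pow(2).mean()
embeds, loss = langevin_prompt_step(embeds, task, fluency)
print(loss)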
Recent work attributes progress in NLP to large language models (LMs) with increased model size and large quantities of pretraining data. Despite this, current state-of-the-art LMs for Hebrew are both under-parameterized and under-trained compared to LMs in other languages. Additionally, previous work on pretrained Hebrew LMs focused on encoder-only models. While the encoder-only architecture is beneficial for classification tasks, it does not cater well for sub-word prediction tasks, such as Named Entity Recognition, when considering the morphologically rich nature of Hebrew. In this paper we argue that sequence-to-sequence generative architectures are more suitable for LLMs in the case of morphologically rich languages (MRLs) such as Hebrew. We demonstrate that by casting tasks in the Hebrew NLP pipeline as text-to-text tasks, we can leverage powerful multilingual, pretrained sequence-to-sequence models such as mT5, eliminating the need for a specialized, morpheme-based, separately fine-tuned decoder. Using this approach, our experiments show substantial improvements over previously published results on existing Hebrew NLP benchmarks. These results suggest that multilingual sequence-to-sequence models present a promising building block for NLP for MRLs.
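A minimal sketch of the text-to-text setup with an off-the-shelf multilingual seq2seq checkpoint; the "segment:" task prefix and the Hebrew example are illustrative assumptions, not the paper's exact prompt format, and the untuned mt5-small checkpoint is only a stand-in:

from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

# Cast a Hebrew task as text-to-text: feed a prefixed source string, decode text.
inputs = tokenizer("segment: הילדים הלכו לבית הספר", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))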
Language models can be prompted to perform a wide variety of zero- and few-shot learning problems. However, performance varies significantly with the choice of prompt, and we do not yet understand why this happens or how to pick the best prompts. In this work, we analyze the factors that contribute to this variance and establish a new empirical hypothesis: the performance of a prompt is coupled with the extent to which the model is familiar with the language it contains. Over a wide range of tasks, we show that the lower the perplexity of the prompt is, the better the prompt is able to perform the task. As a result, we devise a method for creating prompts: (1) automatically extend a small seed set of manually written prompts by paraphrasing using GPT3 and backtranslation and (2) choose the lowest perplexity prompts to get significant gains in performance.
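A hedged sketch of the selection step only, using GPT-2 as a small stand-in for the much larger model and two made-up candidate prompts; only the principle (keep the lowest-perplexity prompt) is taken from the abstract:

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text):
    # Perplexity = exp(mean token cross-entropy) under the language model.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

candidates = [
    "The sentiment of this movie review is",
    "Overall, the reviewer felt the movie was",
]
print(min(candidates, key=perplexity))   # pick the most 'familiar' prompt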
Multilingual language models have been shown to allow for non-trivial transfer across scripts and languages. In this work, we study the structure of the internal representations that enable this transfer. We focus on the representation of gender distinctions as a practical case study, and examine to what extent the gender concept is encoded in subspaces shared across different languages. Our analysis shows that gender representations consist of several prominent components that are shared across languages, alongside language-specific components. The existence of language-independent and language-specific components provides an explanation for an intriguing empirical observation we make: while gender classification transfers well across languages, interventions for gender removal, trained on a single language, do not transfer easily to others.
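One hedged way to probe for such structure, with random tensors standing in for real multilingual embeddings and all names purely illustrative: build a per-language gender direction from paired word embeddings and check how concentrated the stacked directions are in a few shared components:

import torch

def gender_direction(male_embs, female_embs):
    # Mean difference between paired masculine/feminine word embeddings.
    return (male_embs - female_embs).mean(dim=0)

langs = {lang: (torch.randn(50, 300), torch.randn(50, 300)) for lang in ["en", "he", "fr"]}
directions = torch.stack([gender_direction(m, f) for m, f in langs.values()])

# A few dominant singular values would indicate components shared across
# languages; the remainder reflects language-specific parts.
_, s, _ = torch.linalg.svd(directions, full_matrices=False)
print(s / s.sum())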
The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and computational social science. This is commonly approached by training word embeddings on each corpus, aligning the vector spaces, and looking for words whose cosine distance in the aligned space is large. However, these methods often require extensive filtering of the vocabulary to perform well, and, as we show in this work, result in unstable, and hence less reliable, results. We propose an alternative approach that does not use vector space alignment, and instead considers the neighbors of each word. The method is simple, interpretable and stable. We demonstrate its effectiveness in 9 different setups, considering different corpus splitting criteria (age, gender and profession of tweet authors, time of tweet) and different languages (English, French and Hebrew).
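A minimal sketch of the neighbor-based idea with toy embeddings: for each word, compare its k nearest neighbors in the two corpora and treat a small overlap as evidence of usage change. The vocabulary, the embeddings, and the exact scoring are illustrative assumptions:

import torch
import torch.nn.functional as F

def top_k_neighbors(embs, idx, k=10):
    sims = F.cosine_similarity(embs[idx:idx + 1], embs)
    sims[idx] = -1.0                      # exclude the word itself
    return set(sims.topk(k).indices.tolist())

def usage_change_score(embs_a, embs_b, idx, k=10):
    # Fewer shared neighbors -> larger score -> stronger usage change.
    overlap = top_k_neighbors(embs_a, idx, k) & top_k_neighbors(embs_b, idx, k)
    return k - len(overlap)

vocab_size, dim = 1000, 100
embs_corpus_a = torch.randn(vocab_size, dim)   # embeddings trained on corpus A
embs_corpus_b = torch.randn(vocab_size, dim)   # embeddings trained on corpus B
print(usage_change_score(embs_corpus_a, embs_corpus_b, idx=42))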
Google's operational flood forecasting system was developed to provide accurate real-time flood warnings to agencies and the public, with a focus on riverine floods in large, gauged rivers. It became operational in 2018 and has since expanded geographically. The forecasting system consists of four subsystems: data validation, stage forecasting, inundation modeling, and alert distribution. Machine learning is used for two of the subsystems. Stage forecasting is modeled with Long Short-Term Memory (LSTM) networks and linear models. Flood inundation is computed with thresholding and manifold models, where the former computes inundation extent and the latter computes both inundation extent and depth. The manifold model, presented here for the first time, provides a machine-learning alternative to hydraulic modeling of flood inundation. When evaluated on historical data, all models achieve sufficiently high performance metrics for operational use. The LSTM exhibits higher skill than the linear model, while the thresholding and manifold models achieve similar performance metrics for modeling inundation extent. During the 2021 monsoon season, the flood warning system was operational in India and Bangladesh, covering flood-prone regions around rivers with a total area of 287,000 square kilometers, home to more than 350 million people. More than 100 million flood alerts were sent to the affected population, to relevant authorities, and to emergency organizations. Current and future work on the system includes extending coverage to additional flood-prone locations, as well as improving modeling capabilities and accuracy.
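A hedged sketch of the LSTM stage-forecasting component only: a small network mapping a window of past gauge and weather features to a single forecasted stage value. The feature count, window length, and architecture are illustrative assumptions, not the operational system's configuration:

import torch
from torch import nn

class StageForecaster(nn.Module):
    def __init__(self, n_features=4, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, timesteps, n_features)
        _, (h, _) = self.lstm(x)
        return self.head(h[-1])            # forecasted stage, (batch, 1)

model = StageForecaster()
past_window = torch.randn(8, 72, 4)        # 8 gauges, 72 hourly steps, 4 features
print(model(past_window).shape)            # torch.Size([8, 1])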